A Finite State Network for Phonetic Text Processing

Author

  • Edward John Garrett
Abstract

In the past, phonetic transcriptions were made using a wide variety of fonts and formats, which hampered the development of phonetic text processing tools. Today, however, the increasing number of language documentation projects making their data freely available over the Web, combined with the adoption of the Unicode Standard by linguists as a "best practice" character encoding, presents linguistic software developers with an unprecedented opportunity to develop powerful tools for the analysis of phonetic text. This paper describes the generation of a finite state transducer that converts text represented in the International Phonetic Alphabet into phonetic feature sets.
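To make the idea of mapping IPA text to feature sets concrete, here is a minimal Python sketch. It is not the paper's transducer: the function name `transduce`, the tables `BASE` and `MODIFIERS`, and the tiny feature inventory are all hypothetical, and the hand-written two-state loop only approximates what a compiled finite state network would do. It assumes Unicode input, which it decomposes with NFD so that precomposed letters and combining diacritics are handled the same way.

```python
import unicodedata

# Hypothetical, deliberately tiny inventory (not the paper's tables).
# Base letters open a new segment with these features.
BASE = {
    "p": {"consonant", "bilabial", "plosive", "voiceless"},
    "b": {"consonant", "bilabial", "plosive", "voiced"},
    "t": {"consonant", "alveolar", "plosive", "voiceless"},
    "a": {"vowel", "open", "front", "unrounded"},
    "i": {"vowel", "close", "front", "unrounded"},
}

# Modifiers and diacritics add features to the segment currently open.
MODIFIERS = {
    "\u02B0": {"aspirated"},  # MODIFIER LETTER SMALL H
    "\u02D0": {"long"},       # MODIFIER LETTER TRIANGULAR COLON
    "\u0303": {"nasalized"},  # COMBINING TILDE
}


def transduce(text):
    """Map an IPA string to a list of feature sets, one per segment.

    Two states: START (no segment open, current is None) and
    SEGMENT (a segment is open and may still take modifiers).
    """
    output = []
    current = None  # START state
    for ch in unicodedata.normalize("NFD", text):
        if ch in BASE:
            # A base letter closes any open segment and opens a new one.
            if current is not None:
                output.append(frozenset(current))
            current = set(BASE[ch])
        elif ch in MODIFIERS and current is not None:
            # A modifier is only accepted while a segment is open.
            current |= MODIFIERS[ch]
        else:
            raise ValueError(f"unexpected symbol U+{ord(ch):04X}")
    if current is not None:
        output.append(frozenset(current))
    return output


if __name__ == "__main__":
    for features in transduce("p\u02B0a\u02D0"):
        print(sorted(features))
```

A table-driven loop like this is easy to read, but a real finite state toolkit would typically compile such symbol tables into a network that can also be composed with other networks or run in reverse.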

Similar resources

Book Reviews: Statistical Methods for Speech Recognition

Current practitioners in the area of speech recognition who are familiar with the approach of Jelinek and others will find this a compact, concise, and useful overview of the state of the art in statistical approaches to speech recognition. Readers already familiar with Rabiner and Juang (1993) will find it an excellent companion volume. Computational linguists will also find this book to be en...

Efficient Development of Lexical Language Resources and their Representation

Statistical approaches in speech technology, whether used for statistical language models, trees, hidden Markov models or neural networks, represent the driving forces for the creation of language resources (LR), e.g., text corpora, pronunciation and morphology lexicons, and speech databases. This paper presents a system architecture for the rapid construction of morphologic and phonetic lexico...

Design and analysis of a German telephone speech database for phoneme based training

Based on the Sotscheck text corpus, we developed a new corpus that was specifically optimised for training phoneme-based recognition systems. Particular attention was paid to good coverage of phone transitions. Even though the resulting corpus is only slightly enlarged, it shows an increased phonetic coverage while maintaining a good phonetic balance. Results of phonetic statistical analysis ...

The OGI Kids' Speech corpus and recognizers

We describe a corpus of children's speech, called the OGI Kids' Speech corpus, and a speaker- and vocabulary-independent recognition system trained and evaluated with these data. The corpus is composed of both prompted and spontaneous speech from 1100 children from kindergarten through grade 10. The prompted speech was presented as text appearing below an animated character (Baldi) that produced a...

Manipulation in advertising text: lexical and semantic aspect

The present paper focuses on questions of modern advertising science, the structure of advertising, and the elements through which the addresser exerts manipulative influence. Advertising encourages product sales and is an instrument for forming ethical standards and values and for creating cultural values, standards and modes of behavior; that is why the wide system of means for achieving the aims of advertisers is need...

Publication date: 2005